AITopics | proficiency level

Collaborating Authors

proficiency level

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Classifying German Language Proficiency Levels Using Large Language Models

Ahlers, Elias-Leander, Brunsmann, Witold, Schilling, Malte

arXiv.org Artificial IntelligenceDec-9-2025

Assessing language proficiency is essential for education, as it enables instruction tailored to learners needs. This paper investigates the use of Large Language Models (LLMs) for automatically classifying German texts according to the Common European Framework of Reference for Languages (CEFR) into different proficiency levels. To support robust training and evaluation, we construct a diverse dataset by combining multiple existing CEFR-annotated corpora with synthetic data. We then evaluate prompt-engineering strategies, fine-tuning of a LLaMA-3-8B-Instruct model and a probing-based approach that utilizes the internal neural state of the LLM for classification. Our results show a consistent performance improvement over prior methods, highlighting the potential of LLMs for reliable and scalable CEFR classification.

cefr level, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2512.06483

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.54)

Industry: Education (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

Add feedback

Animating Language Practice: Engagement with Stylized Conversational Agents in Japanese Learning

Rackauckas, Zackary, Hirschberg, Julia

arXiv.org Artificial IntelligenceDec-3-2025

We explore Jouzu, a Japanese language learning application that integrates large language models with anime-inspired conversational agents. Designed to address challenges learners face in practicing natural and expressive dialogue, Jouzu combines stylized character personas with expressive text-to-speech to create engaging conversational scenarios. We conducted a two-week in-the-wild deployment with 52 Japanese learners to examine how such stylized agents influence engagement and learner experience. Our findings show that participants interacted frequently and creatively, with advanced learners demonstrating greater use of expressive forms. Participants reported that the anime-inspired style made practice more enjoyable and encouraged experimenting with different registers. We discuss how stylization shapes willingness to engage, the role of affect in sustaining practice, and design opportunities for culturally grounded conversational AI in computer-assisted language learning (CALL). By framing our findings as an exploration of design and engagement, we highlight opportunities for generalization beyond Japanese contexts and contribute to international HCI scholarship.

artificial intelligence, chatbot, natural language, (15 more...)

arXiv.org Artificial Intelligence

2507.06483

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

Add feedback

Examining the Usage of Generative AI Models in Student Learning Activities for Software Programming

Chen, Rufeng, Jiang, Shuaishuai, Shen, Jiyun, Moon, AJung, Wei, Lili

arXiv.org Artificial IntelligenceNov-18-2025

Abstract--The rise of Generative AI (GenAI) tools like Chat-GPT has created new opportunities and challenges for computing education. Existing research has primarily focused on GenAI's ability to complete educational tasks and its impact on student performance, often overlooking its effects on knowledge gains. In this study, we investigate how GenAI assistance compares to conventional online resources in supporting knowledge gains across different proficiency levels. We conducted a controlled user experiment with 24 undergraduate students of two different levels of programming experience (beginner, intermediate) to examine how students interact with ChatGPT while solving programming tasks. We analyzed task performance, conceptual understanding, and interaction behaviors. Our findings reveal that generating complete solutions with GenAI significantly improves task performance, especially for beginners, but does not consistently result in knowledge gains. Importantly, usage strategies differ by experience: beginners tend to rely heavily on GenAI toward task completion often without knowledge gain in the process, while intermediates adopt more selective approaches. We find that both over-reliance and minimal use result in weaker knowledge gains overall. Based on our results, we call on students and educators to adopt GenAI as a learning rather than a problem solving tool. Our study highlights the urgent need for guidance when integrating GenAI into programming education to foster deeper understanding. The rapid development of Generative Artificial Intelligence (GenAI) has led to its widespread adoption across various domains to boost productivity and streamline workflows. Large Language Models (LLMs), such as OpenAI's ChatGPT and Codex, Google Gemini, and GitHub Copilot, have been integrated into domains including software engineering [1], [2], healthcare [3], education [4], creative writing [5], [6], and digital music [7], offering capabilities such as code generation, question answering, and image generation. These authors contributed equally to this work. Some studies evaluated GenAI's performance on programming tasks [8], user interface design education [9], and computer vision coursework [10]. Others focused on assessing the accuracy and usability of GenAIgenerated responses [11], [12].

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.13271

Country:

North America > United States (0.30)
North America > Canada > Quebec > Montreal (0.15)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.68)
Education > Educational Setting (0.48)
Education > Educational Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Add feedback

LangLingual: A Personalised, Exercise-oriented English Language Learning Tool Leveraging Large Language Models

Gupta, Sammriddh, Singh, Sonit, Joshi, Aditya, Kim, Mira

arXiv.org Artificial IntelligenceOct-28-2025

Language educators strive to create a rich experience for learners, while they may be restricted in the extend of feedback and practice they can provide. We present the design and development of LangLingual, a conversational agent built using the LangChain framework and powered by Large Language Models. The system is specifically designed to provide real-time, grammar-focused feedback, generate context-aware language exercises and track learner proficiency over time. The paper discusses the architecture, implementation and evaluation of LangLingual in detail. The results indicate strong usability, positive learning outcomes and encouraging learner engagement.

langlingual, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2510.23011

Country:

Asia (1.00)
Oceania > Australia (0.29)
North America > United States (0.29)

Genre: Research Report (1.00)

Industry:

Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (0.69)
Education > Educational Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

BLiSS 1.0: Evaluating Bilingual Learner Competence in Second Language Small Language Models

Gao, Yuan, Salhan, Suchir, Caines, Andrew, Buttery, Paula, Sun, Weiwei

arXiv.org Artificial IntelligenceOct-23-2025

To bridge the gap between performance-oriented benchmarks and the evaluation of cognitively inspired models, we introduce BLiSS 1.0, a Benchmark of Learner Interlingual Syntactic Structure. Our benchmark operationalizes a new paradigm of selective tolerance, testing whether a model finds a naturalistic learner error more plausible than a matched, artificial error within the same sentence. Constructed from over 2.8 million naturalistic learner sentences, BLiSS provides 136,867 controlled triplets (corrected, learner, artificial) for this purpose. Experiments on a diverse suite of models demonstrate that selective tolerance is a distinct capability from standard grammaticality, with performance clustering strongly by training paradigm. This validates BLiSS as a robust tool for measuring how different training objectives impact a model's alignment with the systematic patterns of human language acquisition.

computational linguistic, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2510.19419

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.55)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

CEFR-Annotated WordNet: LLM-Based Proficiency-Guided Semantic Database for Language Learning

Kikuchi, Masato, Ono, Masatsugu, Soga, Toshioki, Tanabe, Tetsu, Ozono, Tadachika

arXiv.org Artificial IntelligenceOct-22-2025

Although WordNet is a valuable resource owing to its structured semantic networks and extensive vocabulary, its fine-grained sense distinctions can be challenging for second-language learners. To address this, we developed a WordNet annotated with the Common European Framework of Reference for Languages (CEFR), integrating its semantic networks with language-proficiency levels. We automated this process using a large language model to measure the semantic similarity between sense definitions in WordNet and entries in the English Vocabulary Profile Online. To validate our method, we constructed a large-scale corpus containing both sense and CEFR-level information from our annotated WordNet and used it to develop contextual lexical classifiers. Our experiments demonstrate that models fine-tuned on our corpus perform comparably to those trained on gold-standard annotations. Furthermore, by combining our corpus with the gold-standard data, we developed a practical classifier that achieves a Macro-F1 score of 0.81, indicating the high accuracy of our annotations. Our annotated WordNet, corpus, and classifiers are publicly available to help bridge the gap between natural language processing and language education, thereby facilitating more effective and efficient language learning.

cefr level, large language model, natural language, (21 more...)

arXiv.org Artificial Intelligence

2510.18466

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.91)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Toward LLM-Supported Automated Assessment of Critical Thinking Subskills

Peczuh, Marisa C., Kumar, Nischal Ashok, Baker, Ryan, Lehman, Blair, Eisenberg, Danielle, Mills, Caitlin, Chebrolu, Keerthi, Nashi, Sudhip, Young, Cadence, Liu, Brayden, Lachman, Sherry, Lan, Andrew

arXiv.org Artificial IntelligenceOct-16-2025

Critical thinking represents a fundamental competency in today's education landscape. Developing critical thinking skills through timely assessment and feedback is crucial; however, there has not been extensive work in the learning analytics community on defining, measuring, and supporting critical thinking. In this paper, we investigate the feasibility of measuring core "subskills" that underlie critical thinking. We ground our work in an authentic task where students operationalize critical thinking: student-written argumentative essays. We developed a coding rubric based on an established skills progression and completed human coding for a corpus of student essays. We then evaluated three distinct approaches to automated scoring: zero-shot prompting, few-shot prompting, and supervised fine-tuning, implemented across three large language models (GPT-5, GPT-5-mini, and ModernBERT). GPT-5 with few-shot prompting achieved the strongest results and demonstrated particular strength on subskills with separable, frequent categories, while lower performance was observed for subskills that required detection of subtle distinctions or rare categories. Our results underscore critical trade-offs in automated critical thinking assessment: proprietary models offer superior reliability at higher cost, while open-source alternatives provide practical accuracy with reduced sensitivity to minority categories. Our work represents an initial step toward scalable assessment of higher-order reasoning skills across authentic educational contexts.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.12915

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Assessment & Standards (0.94)
Education > Curriculum > Subject-Specific Education (0.66)
Education > Educational Setting > K-12 Education (0.46)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Proficiency-Aware Adaptation and Data Augmentation for Robust L2 ASR

Sun, Ling, Zhu, Charlotte, Shi, Shuju

arXiv.org Artificial IntelligenceOct-14-2025

General-purpose ASR underperforms for atypical speakers, such as L2 learners, reinforcing bias and limiting use in education and accessibility. Using the CEFR-graded Speak and Improve corpus, we show that naive fine-tuning of Whisper reduces average WER but simultaneously widens disparities and disproportionately harms lower-level learners. To address this, we propose two strategies: (i) proficiency-aware multitask learning, jointly optimizing ASR with proficiency classification, and (ii) targeted augmentation, applying spectrogram masking to low-proficiency speech to counter imbalance. These approaches reduce WER by up to 29.4 percent (relative) and insertion/deletion errors by as much as 58.6 percent (relative). Crucially, despite the severe imbalance of the dataset reflecting real-world distributions, both strategies consistently narrow proficiency gaps, advancing equitable ASR for L2 learners.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.10738

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.98)

Add feedback

GenQuest: An LLM-based Text Adventure Game for Language Learners

Wang, Qiao, Labib, Adnan, Swier, Robert, Hofmeyr, Michael, Yuan, Zheng

arXiv.org Artificial IntelligenceOct-7-2025

GenQuest is a generative text adventure game that leverages Large Language Models (LLMs) to facilitate second language learning through immersive, interactive storytelling. The system engages English as a Foreign Language (EFL) learners in a collaborative "choose-your-own-adventure" style narrative, dynamically generated in response to learner choices. Game mechanics such as branching decision points and story milestones are incorporated to maintain narrative coherence while allowing learner-driven plot development. Key pedagogical features include content generation tailored to each learner's proficiency level, and a vocabulary assistant that provides in-context explanations of learner-queried text strings, ranging from words and phrases to sentences. Findings from a pilot study with university EFL students in China indicate promising vocabulary gains and positive user perceptions. Also discussed are suggestions from participants regarding the narrative length and quality, and the request for multi-modal content such as illustrations.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.04498

Country: Asia > China (0.24)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

An Effective Strategy for Modeling Score Ordinality and Non-uniform Intervals in Automated Speaking Assessment

Lo, Tien-Hong, Chen, Szu-Yu, Sung, Yao-Ting, Chen, Berlin

arXiv.org Artificial IntelligenceSep-23-2025

Abstract--A recent line of research on automated speaking assessment (ASA) has benefited from self-supervised learning (SSL) representations, which capture rich acoustic and linguistic patterns in non-native speech without underlying assumptions of feature curation. However, speech-based SSL models capture acoustic-related traits but overlook linguistic content, while text-based SSL models rely on ASR output and fail to encode prosodic nuances. Moreover, most prior arts treat proficiency levels as nominal classes, ignoring their ordinal structure and non-uniform intervals between proficiency labels. T o address these limitations, we propose an effective ASA approach combining SSL with handcrafted indicator features via a novel modeling paradigm. We further introduce a multi-margin ordinal loss that jointly models both the score ordinality and non-uniform intervals of proficiency labels. Extensive experiments on the TEEMI corpus show that our method consistently outperforms strong baselines and generalizes well to unseen prompts.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.03372

Genre: Research Report (0.82)

Industry: Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.88)

Add feedback